Structured and Efficient Variational Deep Learning with Matrix Gaussian Posteriors
Authors
Abstract
We introduce a variational Bayesian neural network where the parameters are governed by a probability distribution over random matrices. Specifically, we employ a matrix variate Gaussian (Gupta & Nagar, 1999) parameter posterior distribution in which we explicitly model the covariance among the input and output dimensions of each layer. Furthermore, by approximating these covariance matrices we obtain a representation of the parameter correlations that is cheaper than a fully factorized parameter posterior. We further show that applying the “local reparametrization trick” (Kingma et al., 2015) to this posterior distribution yields a Gaussian process (Rasmussen, 2006) interpretation of the hidden units in each layer, and, similarly to Gal & Ghahramani (2015), we draw connections with deep Gaussian processes. We then exploit this duality and incorporate “pseudo-data” (Snelson & Ghahramani, 2005) into our model, which allows for more efficient posterior sampling while maintaining the properties of the original model. The validity of the proposed approach is verified through extensive experiments.
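A minimal numpy sketch of the two sampling schemes described above: drawing a weight matrix from a matrix variate Gaussian posterior, and the local reparametrization variant that samples the pre-activations directly. The layer sizes and the factors A and B of the row and column covariances are illustrative assumptions, not the paper's parameterization.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative layer sizes (assumptions, not the paper's settings).
d_in, d_out, batch = 4, 3, 8

# Matrix variate Gaussian posterior W ~ MN(M, U, V): mean M,
# row covariance U = A A^T over inputs, column covariance V = B B^T
# over outputs, so correlations cost O(d_in^2 + d_out^2) parameters
# rather than one free variance per weight.
M = rng.standard_normal((d_in, d_out))
A = 0.1 * rng.standard_normal((d_in, d_in))    # factor of U
B = 0.1 * rng.standard_normal((d_out, d_out))  # factor of V

# Direct weight sampling: W = M + A E B^T with E ~ N(0, I).
E = rng.standard_normal((d_in, d_out))
W = M + A @ E @ B.T

# Local reparametrization: for an input row x, the pre-activation
# h = x W is Gaussian with mean x M and covariance (x U x^T) V,
# so h can be sampled with one noise draw per data point.
X = rng.standard_normal((batch, d_in))
mean = X @ M                               # (batch, d_out)
row_var = np.sum((X @ A) ** 2, axis=1)     # x U x^T for each row
eps = rng.standard_normal((batch, d_out))
H = mean + np.sqrt(row_var)[:, None] * (eps @ B.T)
```

Sampling the pre-activations H rather than the full weight matrix W is what exposes the Gaussian process view of the hidden units mentioned in the abstract.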
Related Papers
Noisy Natural Gradient as Variational Inference
Variational Bayesian neural nets combine the flexibility of deep learning with Bayesian uncertainty estimation. Unfortunately, there is a tradeoff between cheap but simple variational families (e.g., fully factorized) and expensive and complicated inference procedures. We show that natural gradient ascent with adaptive weight noise implicitly fits a variational posterior to maximize the evidence ...
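The snippet below is a toy illustration of the adaptive-weight-noise idea on a quadratic loss: an Adam-style second-moment estimate doubles as a per-weight precision for a factorized Gaussian posterior from which weights are sampled. The objective, constants, and update rule are assumptions for illustration, not the paper's noisy natural-gradient algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

dim, steps, lr, damping = 5, 2000, 0.02, 0.1
w_star = np.arange(1.0, dim + 1.0)   # optimum of the toy quadratic loss

mu = np.zeros(dim)                   # posterior mean over weights
f = np.ones(dim)                     # running squared-gradient (curvature) estimate

for _ in range(steps):
    sigma = np.sqrt(damping / (f + damping))   # adaptive per-weight noise scale
    w = mu + sigma * rng.standard_normal(dim)  # sample weights from q(w)
    grad = w - w_star                          # gradient of 0.5 * ||w - w_star||^2
    f = 0.99 * f + 0.01 * grad ** 2            # Adam-style second moment
    mu -= lr * grad / (f + damping)            # preconditioned update of the mean

print(mu)   # close to w_star; f retains per-weight curvature information
```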
Efficient Modeling of Latent Information in Supervised Learning using Gaussian Processes
Often in machine learning, data are collected as a combination of multiple conditions, e.g., voice recordings of multiple persons, each labeled with an ID. How could we build a model that captures the latent information related to these conditions and generalizes to a new one with only a few data points? We present a new model called Latent Variable Multiple Output Gaussian Processes (LVMOGP) that all...
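The abstract above couples conditions through latent variables. Below is a minimal numpy sketch of that kernel structure under common assumptions: each condition gets a latent vector, and the joint covariance over (input, condition) pairs is a product (Kronecker) kernel. Function names, lengthscales, and sizes are illustrative, not the paper's API.

```python
import numpy as np

rng = np.random.default_rng(0)

def rbf(A, B, ls):
    # Squared-exponential kernel between row vectors of A and B.
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-0.5 * d2 / ls**2)

n_cond, latent_dim = 3, 2
X = np.linspace(0, 5, 30)[:, None]             # shared inputs
H = rng.standard_normal((n_cond, latent_dim))  # one latent vector per condition

Kx = rbf(X, X, ls=1.0)   # covariance over inputs
Kh = rbf(H, H, ls=1.0)   # covariance over conditions via the latent space
K = np.kron(Kh, Kx)      # joint covariance over all (condition, input) pairs

# One draw from the joint prior: correlated functions, one per condition,
# so observations from one condition inform predictions for the others.
f = np.linalg.cholesky(K + 1e-8 * np.eye(K.shape[0])) @ rng.standard_normal(K.shape[0])
F = f.reshape(n_cond, -1)   # row c is the function for condition c
```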
A Structured Variational Auto-encoder for Learning Deep Hierarchies of Sparse Features
In this note we present a generative model of natural images consisting of a deep hierarchy of layers of latent random variables, each of which follows a new type of distribution that we call rectified Gaussian. These rectified Gaussian units allow spike-and-slab type sparsity, while retaining the differentiability necessary for efficient stochastic gradient variational inference. To learn the ...
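A toy sketch of the rectified Gaussian idea as described above: a reparameterized Gaussian draw passed through max(0, ·), giving an exact spike at zero (sparsity) while the active branch remains differentiable in the variational parameters. The parameter values are illustrative, and the paper's exact parameterization may differ.

```python
import numpy as np

rng = np.random.default_rng(0)

# Reparameterized rectified Gaussian unit: z = max(0, mu + sigma * eps).
# The max produces spike-and-slab style sparsity (an exact point mass at 0),
# while z stays differentiable w.r.t. (mu, log_sigma) wherever z > 0.
mu, log_sigma = 0.2, -0.5          # illustrative variational parameters
eps = rng.standard_normal(10000)
z = np.maximum(0.0, mu + np.exp(log_sigma) * eps)

print("sparsity (fraction exactly zero):", np.mean(z == 0.0))
print("mean of active units:", z[z > 0].mean())
```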
Approximate Inference for Deep Latent Gaussian Mixtures
Deep latent Gaussian models (DLGMs) composed of density and inference networks [14]—the pipeline that defines a Variational Autoencoder [8]—have achieved notable success on tasks ranging from image modeling [3] to semi-supervised classification [6, 11]. However, the approximate posterior in these models is usually chosen to be a factorized Gaussian, thereby imposing strong constraints on the po...
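For reference, a small numpy sketch of the factorized Gaussian posterior being criticized, with made-up encoder outputs: the reparameterized sample and the analytic KL to a standard normal prior. The diagonal covariance is exactly the constraint the abstract points to, since it cannot represent correlations between latent dimensions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Mean-field Gaussian posterior q(z | x) = N(mu(x), diag(sigma(x)^2)).
mu = np.array([0.5, -1.0, 0.3])        # encoder mean, illustrative
log_var = np.array([-0.2, 0.1, -1.0])  # encoder log-variance, illustrative

eps = rng.standard_normal(mu.shape)
z = mu + np.exp(0.5 * log_var) * eps   # reparameterized sample from q(z|x)

# Analytic KL(q || N(0, I)) for a factorized Gaussian, summed over dimensions.
kl = 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var)
print(z, kl)
```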
Structured Variational Inference for Coupled Gaussian Processes
Sparse variational approximations allow for principled and scalable inference in Gaussian process (GP) models. In settings where several GPs are part of the generative model, these GPs are a posteriori coupled. For many applications, such as regression where predictive accuracy is the quantity of interest, this coupling is not crucial. However, if one is interested in posterior uncertainty, it c...
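As context for the sparse variational machinery the abstract builds on, here is a minimal numpy sketch of how a variational distribution q(u) = N(m, S) over inducing-point function values induces an approximate GP posterior at new inputs. All names, sizes, and the RBF kernel are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def rbf(A, B, ls=1.0):
    # Squared-exponential kernel between row vectors of A and B.
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-0.5 * d2 / ls**2)

Z = np.linspace(0, 5, 6)[:, None]    # inducing inputs
X = np.linspace(0, 5, 40)[:, None]   # test inputs
m = rng.standard_normal(6)           # variational mean of q(u)
L = 0.3 * np.tril(rng.standard_normal((6, 6))) + 0.5 * np.eye(6)
S = L @ L.T                          # variational covariance of q(u)

Kzz = rbf(Z, Z) + 1e-8 * np.eye(6)
Kxz = rbf(X, Z)
A = np.linalg.solve(Kzz, Kxz.T).T    # Kxz @ inv(Kzz), computed stably

# Marginalizing u gives q(f(X)) = N(A m, Kxx - A Kzz A^T + A S A^T).
mean = A @ m
cov = rbf(X, X) - A @ Kzz @ A.T + A @ S @ A.T
```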
Journal:
Volume / Issue:
Pages: -
Publication year: 2016